Projector 2: contig mapping for efficient gap-closure of prokaryotic genome sequence assemblies

Authors

  • Sacha A. F. T. van Hijum
  • Aldert L. Zomer
  • Oscar P. Kuipers
  • Jan Kok
Abstract

With genome sequencing efforts increasing exponentially, valuable information accumulates on genomic content of the various organisms sequenced. Projector 2 uses (un)finished genomic sequences of an organism as a template to infer linkage information for a genome sequence assembly of a related organism being sequenced. The remaining gaps between contigs for which no linkage information is present can subsequently be closed with direct PCR strategies. Compared with other implementations, Projector 2 has several distinctive features: a user-friendly web interface, automatic removal of repetitive elements (repeat-masking) and automated primer design for gap-closure purposes. Moreover, when using multiple fragments of a template genome, primers for multiplex PCR strategies can also be designed. Primer design takes into account that, in many cases, contig ends contain unreliable DNA sequences and repetitive sequences. Closing the remaining gaps in prokaryotic genome sequence assemblies is thereby made very efficient and virtually effortless. We demonstrate that the use of single or multiple fragments of a template genome (i.e. unfinished genome sequences) in combination with repeat-masking results in mapping success rates close to 100%. The web interface is freely accessible at http://molgen.biol.rug.nl/websoftware/projector2.
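
The template-based mapping idea described in the abstract can be illustrated with a minimal sketch. It assumes that each contig has already been aligned to the template genome and that its best hit is summarised by a start and end coordinate. The `ContigHit` type, the `infer_linkage` function, and the example coordinates are hypothetical and are not taken from Projector 2 itself.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ContigHit:
    """Hypothetical summary of one contig's best alignment to the template."""
    name: str
    template_start: int  # 1-based start coordinate on the template
    template_end: int    # inclusive end coordinate on the template


def infer_linkage(hits: List[ContigHit]) -> List[Tuple[str, str, int]]:
    """Order contigs by template position and list neighbouring pairs.

    Returns (left contig, right contig, estimated gap size in bp) tuples.
    A negative size means the mapped regions overlap on the template.
    """
    ordered = sorted(hits, key=lambda h: h.template_start)
    links = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right.template_start - left.template_end - 1
        links.append((left.name, right.name, gap))
    return links


# Made-up coordinates, purely for illustration.
hits = [
    ContigHit("contig_B", 5200, 9100),
    ContigHit("contig_A", 100, 4800),
    ContigHit("contig_C", 9050, 12000),
]
links = infer_linkage(hits)
for left, right, gap in links:
    print(f"{left} -> {right}: estimated gap {gap} bp")
```

Each neighbouring pair in the output is a candidate for a gap-closing PCR with one primer near each contig end. The sketch leaves out contig orientation, repeat-masking, circular chromosomes, and primer design, all of which a real mapping tool would have to handle.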

Similar resources

General method of rapid Smith/Birnstiel mapping adds for gap closure in shotgun microbial genome sequencing projects: application to Pseudomonas putida KT2440.

A physical mapping strategy has been developed to verify and accelerate the assembly and gap closure phase of a microbial genome shotgun-sequencing project. The protocol was worked out during the ongoing Pseudomonas putida KT2440 genome project. A macro-restriction map was constructed by linking probe hybridisation of SwaI- or I-CeuI-restricted chromosomes to serve as a backbone for the quick q...

From sequence mapping to genome assemblies.

The development of "next-generation" high-throughput sequencing technologies has made it possible for many labs to undertake sequencing-based research projects that were unthinkable just a few years ago. Although the scientific applications are diverse (e.g., new genome projects, gene expression analysis, genome-wide functional screens, or epigenetics), the sequence data are usually processed in ...

Shotgun optical maps of the whole Escherichia coli O157:H7 genome.

We have constructed NheI and XhoI optical maps of Escherichia coli O157:H7 solely from genomic DNA molecules to provide a uniquely valuable scaffold for contig closure and sequence validation. E. coli O157:H7 is a common pathogen found in contaminated food and water. Our approach obviated the need for the analysis of clones, PCR products, and hybridizations, because maps were constructed from e...

GMcloser: closing gaps in assemblies accurately with a likelihood-based selection of contig or long-read alignments

MOTIVATION: Genome assemblies generated with next-generation sequencing (NGS) reads usually contain a number of gaps. Several tools have recently been developed to close the gaps in these assemblies with NGS reads. Although these gap-closing tools efficiently close the gaps, they entail a high rate of misassembly at gap-closing sites. RESULTS: We have found that the assembly error rates caused ...

The hidden perils of read mapping as a quality assessment tool in genome sequencing

This article provides a comparative analysis of the various methods of genome sequencing focusing on verification of the assembly quality. The results of a comparative assessment of various de novo assembly tools, as well as sequencing technologies, are presented using a recently completed sequence of the genome of Lactobacillus fermentum 3872. In particular, quality of assemblies is assessed b...

Journal:
  • Nucleic Acids Research

Volume 33

Publication date: 2005